Abstract

Data assimilation schemes are confronted with the presence of model errors arising from the imperfect 
description of atmospheric dynamics. These errors are usually modeled on the basis of simple assumptions 
such as bias, white noise, or a first-order Markov process. In the present work, a formulation of the sequential
extended Kalman filter is proposed, based on recent findings on the universal deterministic behavior of
model errors (Nicolis, 2004), in sharp contrast with previous approaches. This new scheme is applied in the
context of a spatially distributed system proposed by Lorenz (1996). It is found that (i) for short times, the 
estimation error is accurately approximated by an evolution law in which the variance of the model error 
(assumed to be a deterministic process) evolves according to a quadratic law, in agreement with the theory. 
Moreover, the correlation with the initial condition error appears to play a secondary role in the short time 
dynamics of the estimation error covariance. (ii) The deterministic description of the model error evolution, 
incorporated into the classical extended Kalman filter equations, reveals that substantial improvements of 
the filter accuracy can be gained as compared with the classical white noise assumption. The universal, 
short time, quadratic law for the evolution of the model error covariance matrix seems very promising for 
modeling estimation error dynamics in sequential data assimilation. 
1 Introduction

The problem of estimating the state of an evolving system from an incomplete set of noisy observations is 
the central theme of the classical estimation theory and it constitutes the object of the filtering procedures 
(Jazwinski, 1970). The filtering theory finds its natural applications in experimental and applied physics in 
general as well as in engineering, where the aim is to extract as much information as possible about a natural 
or laboratory scale phenomenon on the basis of a limited amount of noisy observations.

The classical filtering and prediction problem has encountered a natural field of application in the context 
of numerical weather and oceanic prediction, where it is usually referred to as data assimilation (Bengtsson 
et al., 1981; Daley, 1991; Kalnay, 2003). The ultimate goal in this context is to provide the best possible 
estimate of an unknown system's state by using all the information available (Talagrand, 1997). Typically, it is 
based on the information provided by the laws governing the unknown system (i.e. the model) under the form 
of a continuous dynamical system, while the observations are available only at discrete times. In sequential 
assimilation the system's state estimate, given by a solution of the model equations, is updated at each time 
when observations are available. This update is usually referred to as the analysis. The procedure consists 
therefore of a sequence of analyses performed at observation times and of integrations of the model between 
successive analyses (Talagrand, 1997). 

It is well known that in the case of a linear dynamics, and a linear relation between observations and the 
system's state variables, the filtering problem can be expressed via the Kalman filter (KF) equations (Kalman, 
1960; Jazwinski, 1970). The KF formulation is an elegant and comprehensive mathematical description: a 
closed set of equations providing the optimal linear solution of the filtering problem, that is to say the state 
estimate and the associated error covariances, in the hypothesis that the observation and the model error are 
Gaussian, white-in-time, and mutually uncorrelated.

For nonlinear dynamics, an extension of the KF formulation has been proposed, the extended Kalman filter 
(EKF) (Jazwinski, 1970), which has been largely studied in geophysical contexts (see e.g., Ghil, 1989; Ghil 
and Malanotte, 1991; Miller et al., 1994). In the EKF the system state evolves according to the full nonlinear 
dynamics, while the associated error covariances are propagated in time through the linearized dynamics. As
long as the dynamics of the estimation error is well approximated by the tangent linear equations, the EKF has
proven to be quite efficient (Miller et al., 1994; Yang et al., 2006) while preserving one of the most attractive
features of the KF equations, namely the time propagation of the error covariance, which is desirable especially
in the case of chaotic dynamics where the error evolution is conditioned by the local instabilities of the flow. 

Besides the computational problems and the intrinsic inaccuracy related to the use of a linearized model for 
propagating the error statistics, another fundamental issue in the applications of KF-like sequential assimilations
is the assumption on the model error. Indeed, models used for meteorological and climate prediction still have 
large and unknown deficiencies, including inaccurate parameterizations of physical processes not accounted for 
in the model as well as errors in the numerical integration scheme. In this circumstance, prediction errors are
essentially due to the use of inadequate models, and their chaotic dynamics leads to the rapid amplification of
these errors as well as of initial condition ones. Modern, advanced data assimilation techniques have tackled
this latter problem so that progressively more accurate initial conditions for environmental prediction are made
available (for a comprehensive description of data assimilation development see Daley, 1991; Kalnay, 2003). 
Even though it is still not possible to eliminate model deficiencies, a number of solutions have been proposed 
in recent years to estimate and/or account for model error in data assimilation. The state augmentation 
method (Jazwinski, 1970) was primarily introduced in the context of Kalman filtering. In this method, the 
state estimation problem is formulated in terms of an augmented state vector which includes, along with the 
state estimate, a set of parameters used for the model error representation. This approach has been applied 
successfully in both KF-like and variational assimilation (see e.g. Zupanski, 1997; Nichols, 2003; Zupanski 
and Zupanski, 2006). Dee and da Silva (1998) proposed an algorithm to estimate and remove the biases in 
the background field in a data assimilation system due to additive systematic model error. The method was 
implemented for the bias correction of the humidity analysis component of the Goddard Earth Observing System 
(GEOS) assimilation system (Dee and Todling, 2000) and more recently in the context of the European Centre
for Medium-Range Weather Forecasts (ECMWF) ocean data assimilation (Balmaseda et al., 2007).

A key ingredient of the state augmentation technique is the definition of a "model" for the model error (see
e.g. Nichols, 2003). Recently Zupanski and Zupanski (2007) have described the model error evolution using a 
first order Markov process. A similar assumption was already used in Daley (1992) to investigate the impact 
of time correlated model errors in Kalman filtering. He showed the detrimental effect on the analysis accuracy 
of the absence of time correlation in the description of the model error evolution. More generally these studies 
have highlighted the urgent requirement of a deeper understanding of model error dynamics. 

In recent years, a significant body of work on model error dynamics appeared, in particular in relation 
to predictability studies (see e.g. Reynolds et al., 1994; Orrell et al., 2001; Vannitsem and Toth, 2002). In
particular, the foundations of the dynamics of deterministic model errors have been laid in Nicolis (2003) and
Nicolis (2004). In these works, some generic features of model error dynamics, such as the existence of a
universal quadratic law for the short time mean square error evolution, whose extent of validity is related to
the Lyapunov spectrum of the underlying dynamics, have been highlighted. The relative roles of deterministic
dynamics and stochastic error source terms have also been investigated (Nicolis, 2004).

These results are the starting point of the present study whose ultimate goal is to investigate if these 
generic features found on the deterministic model error dynamics can be efficiently incorporated in the EKF. 
The purpose here is twofold. First, we investigate the dynamics of state estimation error when both initial 
condition and model errors are present, and second, an EKF formulation which takes into account the model 
error dynamics is examined. These questions are analyzed in the context of a low order chaotic system (Lorenz, 
1996) in which model error originates from an inaccurate specification of some of the model parameters.

The paper is organized as follows. In Sect. 2 the formulation of the problem is presented and deterministic 
evolution equations for the mean and covariance error are derived, along with suitable short time approximations. 
The results are used to describe model error evolution within the analysis intervals and lead to a formulation of 
the EKF in which the deterministic character of the model error is accounted for (Sect. 3). Section 4 contains 
the results of the numerical analysis; the first part describes the experiments related to the analytical derivation 
of Sect. 2 on the estimation error evolution, while in the second, the EKF with the deterministic model error 
description is tested and compared with the white noise model error formulation. Conclusions are drawn in 
Sect. 5. 



2 Error dynamics in the presence of initial condition and model 
errors 

In this section a general equation for the evolution of the error associated to the estimate of the state of a system 
is derived, based on the use of an imperfect model of the system's dynamics, and an approximate knowledge of 
its initial state. 

Let the (unknown) "true" dynamics, the nature, be represented in the form: 

$$\frac{dy(t)}{dt} = f(y(t), \lambda) \qquad (1)$$

The state vector $y$ lies in an $I$-dimensional vector space; the governing law $f$, typically a nonlinear function,
is defined in $\mathbb{R}^I$, and $\lambda$ is a $P$-dimensional vector of parameters. The function $f$ can have an explicit dependence
on time but it is dropped here to simplify the notation.

Let x be the variable of the model at our disposal obeying: 

$$\frac{dx(t)}{dt} = g(x(t), \lambda'), \qquad (2)$$

where $g$ is a nonlinear function. Typically, $g$ is defined on a phase space which is different from that of $f$ in Eq. (1),
and $\lambda'$ is a $Q$-dimensional vector ($Q \neq P$).

The error in the estimate of $y$ that originates from the approximate description of the true dynamics, Eq. (2),
is referred to as model error. Different sources of model error are possible such as, for instance, errors related to
an inadequate description of some physical processes and/or to the limited representation of relevant scales. In 
this study, we focus on the situation in which the model and the true trajectories span the same phase space. 
Model error is due only to uncertainties in the specification of the parameters appearing in the evolution law g. 
This formulation accounts, for instance, for errors in the description of some physical processes (dissipations, 
external forcing, etc.) represented by the parameters. The role of the unresolved scales in the model dynamics 
and the consequence for the data assimilation will be addressed in a future work. 

The model state and parameter, $x$ and $\lambda'$, are therefore $I$-dimensional and $P$-dimensional vectors respectively,
and the evolution equation (2) can be rewritten as:

$$\frac{dx(t)}{dt} = f(x, \lambda') \qquad (3)$$

An equation for the evolution of the state estimation error $\delta x(t) = y(t) - x(t)$ can be obtained by taking
the difference between Eqs. (1) and (3). The evolution of $\delta x(t)$ depends on the error estimate at the initial
time $t = t_0$ (initial condition error $\delta x(t_0) = \delta x_0$) and on the model error. If $\delta x$ is "small", the linearized
dynamics provides a reliable approximation of the actual error evolution. The linearization is made along a
model trajectory, solution of Eq. (3), by expanding, to the first order in $\delta x$ and $\delta\lambda$ ($= \lambda - \lambda'$), the difference
between Eqs. (1) and (3):



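$$\frac{d\delta x}{dt} \approx \left.\frac{\partial f}{\partial x}\right|_{\lambda'} \delta x + \left.\frac{\partial f}{\partial \lambda}\right|_{\lambda'} \delta\lambda \qquad (4)$$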



The first partial derivative on the r.h.s. of Eq. (4) is the Jacobian of the model dynamics evaluated along its
trajectory. The second term, which corresponds to the model error, will be denoted $\delta\mu$ hereafter to simplify the
notation; $\delta\mu = \left.\frac{\partial f}{\partial \lambda}\right|_{\lambda'} \delta\lambda$.

The solution of Eq. (4), with initial condition $\delta x_0$ at $t = t_0$, reads:

$$\delta x(t) \approx M_{t,t_0} \delta x_0 + \int_{t_0}^{t} d\tau \, M_{t,\tau} \, \delta\mu(\tau) = \delta x^{ic}(t) + \delta x^{m}(t) \qquad (5)$$

with $M_{t,t_0}$ being the fundamental matrix (the propagator) relative to the linearized dynamics along the
trajectory between $t_0$ and $t$. We point out that $\delta\mu$ and $M_{t,\tau}$ in (5) depend on $\tau$ through the state variable $x$.
Equation (5) states that, in the linear approximation, the error in the state estimate is given by the sum of two
terms, one relative to the evolution of initial condition error, $\delta x^{ic}$, and another one relative to the model error,
$\delta x^{m}$. The presence of the fundamental matrix $M$ in the expression for $\delta x^{m}$ suggests that, similarly to what
occurs for the initial condition error, the instabilities of the flow play a role in the dynamics of model error as
well.

Let us now apply to Eq. (5) the expectation operator defined locally around the reference trajectory, by
sampling over an ensemble of initial conditions and model errors:

$$\langle \cdot \rangle = \int d\delta x_0 \int d\delta\mu \; p_0(\delta x_0) \, p_m(\delta\mu) \, (\cdot) \qquad (6)$$

where $p_0$ and $p_m$ are the probability density function for the initial condition and the probability density of
the model error respectively. This definition is consistent with the expectation operator usually used in state
estimation algorithms. We then get the evolution equation for the mean estimation error along a reference
trajectory:

$$\langle \delta x(t) \rangle \approx M_{t,t_0} \langle \delta x_0 \rangle + \int_{t_0}^{t} d\tau \, M_{t,\tau} \langle \delta\mu(\tau) \rangle = \langle \delta x^{ic} \rangle + \langle \delta x^{m} \rangle \qquad (7)$$

In a perfect model scenario an unbiased state estimate at time $t_0$ ($\langle \delta x_0 \rangle = 0$) will evolve, under the
linearized dynamics, into an unbiased estimate at time $t$. This is not necessarily true when model error is
present and, depending on the properties of model error, an initially unbiased estimate can evolve into a biased
one. The important factor controlling the evolution of the mean state estimation error is the model error mean
$\langle \delta\mu(t) \rangle$. In view of the hypothesis made above on the type of model error, the latter is expressible as a
function of the model variables and of the parametric error $\delta\lambda$, so that in the general case we expect $\langle \delta\mu(t) \rangle$
to be different from zero. This fact has important implications for classical least-square (or Bayesian) based
data assimilation algorithms, which are derived assuming that the errors associated with each piece of information
entering the analysis update are Gaussian and unbiased (Talagrand, 1997). In that context, if the bias is
not properly accounted for and removed, the resulting analysis state will be biased (Dee and da Silva, 1998).
Equation (7) provides a way to estimate the bias due to an incorrect specification of the model parameters.

Formal expressions for other moments of the unknown error probability density function (PDF) can be
derived similarly. Since the focus here is on data assimilation related problems, we will make the simplifying
assumption that the underlying error PDF is (or is close to) Gaussian. As a consequence the full PDF is mainly
characterized by its first and second moments: the mean and the covariance.

The evolution equation of the state estimation error covariance matrix can be obtained by taking the
expectation of the external product of $\delta x(t)$ by itself. As above, the expectation is made over a large sample
of initial conditions and model errors, assuming that the estimation error bias is known and removed from the
background error field, and leads to:

$$P(t) = \langle (\delta x(t))(\delta x(t))^T \rangle \approx P^{ic}(t) + P^{m}(t) + P^{corr}(t) + (P^{corr})^T(t) \qquad (8)$$

where: 

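$$P^{ic}(t) = M_{t,t_0} \langle (\delta x_0)(\delta x_0)^T \rangle M^T_{t,t_0} \qquad (9)$$

$$P^{m}(t) = \int_{t_0}^{t} d\tau \int_{t_0}^{t} d\tau' \, M_{t,\tau} \langle (\delta\mu(\tau))(\delta\mu(\tau'))^T \rangle M^T_{t,\tau'} \qquad (10)$$

$$P^{corr}(t) = M_{t,t_0} \int_{t_0}^{t} d\tau \, \langle (\delta x_0)(\delta\mu(\tau))^T \rangle M^T_{t,\tau} \qquad (11)$$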



The four terms of the r.h.s. of Eq. (8) depict the evolution of the initial condition error covariance, the
model error covariance and their cross correlation matrices, respectively. The evolution of the quadratic error is 
obtained by taking the trace of the corresponding covariance matrix. It is interesting to note that the net effect 
of the correlation between initial condition and model error may result in a reduction of the total estimation 
error. 

We now turn briefly to the situation in which the model error, $\delta\mu$, is assumed to be an additive random
disturbance. In this case, Eq. (4) takes the form of a stochastic differential equation whose solution
depends on the properties of the random process itself. For a white noise Gaussian process, $\langle \delta\mu \rangle = 0$ and
$\langle (\delta\mu(t))(\delta\mu(t'))^T \rangle = Q \, \delta(t - t')$, where $Q = \langle (\delta\mu(t))(\delta\mu(t))^T \rangle$ is a positive definite matrix representing the
covariance of the process and $\delta(t - t')$ is a Dirac delta function.

In analogy to the case of deterministic model error, we can derive an expression for the evolution of estimation 
error covariance matrix when the model error is treated as a white noise process, by taking the expectation 
value of the external product of Eq. (5) by itself:

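$$P_{wn}(t) = P^{ic}(t) + P^{m}_{wn}(t) \qquad (12)$$

with $P^{ic}$ the same as in Eq. (9), and $P^{m}_{wn}$ representing the model error covariance matrix in this white noise
case:

$$P^{m}_{wn}(t) = \int_{t_0}^{t} \int_{t_0}^{t} d\tau \, d\tau' \, M_{t,\tau} \langle (\delta\mu(\tau))(\delta\mu(\tau'))^T \rangle M^T_{t,\tau'} = \int_{t_0}^{t} M_{t,\tau} \, Q \, M^T_{t,\tau} \, d\tau \qquad (13)$$

with the subscript "wn" standing for "white noise".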
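
The passage from the double to the single integral in Eq. (13) follows from the sifting property of the Dirac delta. Written out as a short worked step, using only the white noise covariance stated above:

```latex
\begin{align*}
P^{m}_{wn}(t)
  &= \int_{t_0}^{t} d\tau \int_{t_0}^{t} d\tau' \,
     M_{t,\tau} \, Q \, \delta(\tau - \tau') \, M^{T}_{t,\tau'} \\
  &= \int_{t_0}^{t} d\tau \, M_{t,\tau} \, Q \, M^{T}_{t,\tau}
\end{align*}
```

The inner integral over $\tau'$ simply evaluates $M^{T}_{t,\tau'}$ at $\tau' = \tau$, which is why no time correlation of the model error survives in the white noise case.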
We finally point out that the mean and covariance matrix evolution equations derived so far are based on
the expectation operator defined in Eq. (6). This means that we implicitly assume to have a reference
trajectory, and to be interested in exploring the distribution of possible errors in the estimation of the trajectory
itself. This is typically the case in data assimilation applications, where one is usually interested in evaluating
the error associated to the estimate of a given trajectory, solution of a model integration. The same averaging 
procedure can then be repeated by sampling over an ensemble of states over the system's attractor (through 
the probability density function associated with the model attractor) in order to get information independent 
of the initial conditions as usually done in traditional analysis of error dynamics (Nicolis, 1992). 

Short time error evolution 

Let us now turn to a short-time approximation of the error evolution equations derived above, including the
white noise case, Eq. (13).

We proceed by expanding Eq. (5) in Taylor series, up to the first nontrivial order, only for the model error
term $\delta x^{m}$, while keeping the initial condition term, $\delta x^{ic}$, unchanged.

In this case, the model error $\delta x^{m}$ evolves linearly with time according to:

$$\delta x^{m} \approx \delta\mu_0 \, (t - t_0) \qquad (14)$$

where $\delta\mu(t_0) = \delta\mu_0$.
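
The linear growth in Eq. (14) can be made explicit. The following is a sketch of the expansion, assuming only that the propagator reduces to the identity at coinciding times, $M_{t_0,t_0} = I$:

```latex
\begin{align*}
\delta x^{m}(t)
  &= \int_{t_0}^{t} d\tau \, M_{t,\tau} \, \delta\mu(\tau) \\
  &= M_{t_0,t_0} \, \delta\mu(t_0) \, (t - t_0) + O\big((t - t_0)^2\big) \\
  &= \delta\mu_0 \, (t - t_0) + O\big((t - t_0)^2\big)
\end{align*}
```

The integral vanishes at $t = t_0$, and its first time derivative there is the integrand evaluated at $\tau = t = t_0$, which gives the leading term.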

By adding the initial condition error term $\delta x^{ic}$, we get a short time approximation of Eq. (5):

$$\delta x(t) \approx M_{t,t_0} \delta x_0 + \delta\mu_0 \, (t - t_0) \qquad (15)$$

For the mean error we get:

$$\langle \delta x(t) \rangle \approx M_{t,t_0} \langle \delta x_0 \rangle + \langle \delta\mu_0 \rangle (t - t_0) \qquad (16)$$

As already noted by Nicolis (2003), the mean model error evolves linearly in time as long as the average $\langle \delta\mu_0 \rangle$
is different from zero; otherwise the evolution is conditioned by higher orders of the Taylor expansion.

We remark that the two terms in the short time error evolution Eqs. (15) and (16) are not on equal footing
since, in contrast to the model error term, which has been expanded up to the first nontrivial order in time,
the first r.h.s. term (the term describing the initial condition error evolution) contains all the orders of time
($t, t^2, ..., t^n$). The point is that, as explained in the sequel, we intend to use these equations to model the error
evolution in conjunction with the technique of data assimilation. As discussed in Section 3, in this technique it
is customary to proceed with the full matrix $M$ as it appears in Eqs. (15) and (16). We stress however that,
since the assimilation intervals of a data assimilation cycle are short, effectively only the first terms of a Taylor
expansion of $M$ are relevant. This secures the overall consistency of our evaluation.
Taking the expectation value of the external product of Eq. (15) by itself, we get:

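$$P(t) \approx M_{t,t_0} \langle (\delta x_0)(\delta x_0)^T \rangle M^T_{t,t_0} + \left[ \langle (\delta\mu_0)(\delta x_0)^T \rangle M^T_{t,t_0} + M_{t,t_0} \langle (\delta x_0)(\delta\mu_0)^T \rangle \right] (t - t_0) + \langle (\delta\mu_0)(\delta\mu_0)^T \rangle (t - t_0)^2 \qquad (17)$$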

Equation (17) is the short time evolution equation, in this linearized setting, for the error covariance matrix
in the presence of both initial condition and model errors. Note that while the model error term is bound to evolve
quadratically, in agreement with the results of Nicolis (2003), the correlation terms behave linearly with time,
and these terms may have a compensating effect resulting in a reduction of the total error. The extent of
validity of the quadratic short time regime of the mean square error is related to the largest (in absolute value)
Lyapunov exponent (the most negative one for the large class of dissipative systems), see Nicolis (2003). This result is
also confirmed by the numerical results presented below.

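A case which can have practical relevance is obtained when the model error is independent of $\delta x_0$ (error in
the specification of external source parameters, for instance) and the initial condition errors are unbiased. In these
circumstances Eq. (17) reads:

$$P(t) \approx M_{t,t_0} \langle (\delta x_0)(\delta x_0)^T \rangle M^T_{t,t_0} + \langle (\delta\mu_0)(\delta\mu_0)^T \rangle (t - t_0)^2 \qquad (18)$$

and the correlation terms cancel, since $\langle (\delta\mu_0)(\delta x_0)^T \rangle = \langle \delta\mu_0 \rangle \langle \delta x_0 \rangle^T = 0$.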

A similar relation can be built when considering a white noise model error term. Developing the second
term of Eq. (12) in Taylor series and keeping the dominant order alone, one gets:

$$P(t) \approx M_{t,t_0} \langle (\delta x_0)(\delta x_0)^T \rangle M^T_{t,t_0} + Q \, (t - t_0) \qquad (19)$$

By comparing Eqs. (18) and (19) we conclude that while the model error covariance matrix evolves linearly
with time if the model error acts as a white noise process, it is bound to evolve quadratically, for short times, 
in the case of deterministic model error. 
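
The contrast between the two growth laws can be illustrated numerically. The sketch below is only illustrative: the matrix `Q`, the time values and the function names are assumptions, not quantities taken from the experiments, and the same matrix is used for both laws purely to isolate the time dependence.

```python
import numpy as np

def white_noise_growth(Q, dt):
    """Model error covariance growing linearly in time (last term of Eq. 19)."""
    return Q * dt

def deterministic_growth(Q0, dt):
    """Model error covariance growing quadratically in time (last term of Eq. 18)."""
    return Q0 * dt**2

# Hypothetical 3x3 covariance, used for both laws for illustration only.
Q = np.diag([0.5, 1.0, 2.0])

for dt in (0.01, 0.1, 1.0):
    lin = np.trace(white_noise_growth(Q, dt))
    quad = np.trace(deterministic_growth(Q, dt))
    # The ratio of the two traces equals dt itself.
    print(f"dt={dt:5.2f}  linear={lin:.5f}  quadratic={quad:.5f}  ratio={quad / lin:.2f}")
```

For short intervals the quadratic law yields a much smaller model error covariance than the linear one for the same matrix, which is the factor that distinguishes the two filter formulations discussed later.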

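Finally, as discussed previously, the average procedure at the basis of Eqs. (16) and (18) can also be
performed by averaging over an ensemble of states on the system's attractor (symbolically represented by
$\langle\langle \cdot \rangle\rangle$), leading to:

$$\langle\langle \delta x(t) \rangle\rangle \approx \langle\langle M_{t,t_0} \delta x_0 \rangle\rangle + \langle\langle \delta\mu_0 \rangle\rangle (t - t_0) \qquad (20)$$

$$P(t) \approx \langle\langle (M_{t,t_0} \delta x_0)(M_{t,t_0} \delta x_0)^T \rangle\rangle + \langle\langle (\delta\mu_0)(\delta\mu_0)^T \rangle\rangle (t - t_0)^2 \qquad (21)$$

Relations (20) and (21) are used in Sect. 4.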



3 Extended Kalman filter in the presence of model error 

In this section we first review the classical EKF equations; then a formulation of the filter that accounts for
the deterministic dynamics of the parametric model error is presented.

Let us assume that a set of $M < I$ noisy observations of the true system (1), stored as the components of an
$M$-dimensional observation vector $y^o$, is available at the regularly spaced discrete times $t_k = t_0 + k\tau$, $k = 1, 2, ...$,
with $\tau$ being the assimilation interval; that is:

$$y^o_k = \mathcal{H}(y_k) + \epsilon_k \qquad (22)$$

where $\epsilon$ is the observation error, assumed to be Gaussian with known covariance matrix $R$ and uncorrelated
in time. $\mathcal{H}$ is the (possibly nonlinear) observation operator which maps from model to observation space
(i.e. from model to observed variables) and may involve spatial interpolations (or spectral to physical space
transformation in the case of spectral models) as well as transformations based on physical laws for indirect
measurements (Kalnay, 2003).

It is convenient to write the model equations as a discrete mapping from time $t_k$ to $t_{k+1}$:

$$x^f_{k+1} = \mathcal{M} x^a_k \qquad (23)$$

with $x^f$ and $x^a$ being the forecast and analysis states respectively, and $\mathcal{M}$ the nonlinear model forward operator (the
resolvent of Eq. (3)).

For the EKF, as well as for most least-square based assimilation schemes, the analysis update equation is 
(Jazwinski, 1970; Daley, 1991): 

$$x^a_k = [I - K_k H_k] x^f_k + K_k y^o_k = x^f_k + K_k d_k \qquad (24)$$

where $d = y^o - H x^f$ is the $M$-dimensional innovation vector. The gain matrix $K$ is given by:

$$K = P^f H^T [H P^f H^T + R]^{-1} \qquad (25)$$
where $P^f$ is the $I \times I$ forecast error covariance matrix and $H$ the linearized observation operator (i.e. an $M \times I$
real matrix); we use the unified notation given in Ide et al. (1997) and the temporal index $k$ has been omitted in
Eq. (25) for clarity. The essence of an assimilation scheme is embedded in the gain matrix: it characterizes the
algorithm, the way it "uses" the observations. The characteristics of the observational network are described
through the observation operator $\mathcal{H}$.

In the EKF, the forecast error covariance matrix, $P^f$, is obtained by linearizing the model around its
trajectory between two successive analysis times $t_k$ and $t_{k+1}$. Similarly to what has been already done in
relation with Eq. (5), the forecast error at time $t_{k+1}$, $\delta x^f_{k+1}$, can be approximated as:

$$\delta x^f_{k+1} = M_{k+1,k} \, \delta x^a_k + \delta x^m_{k+1} \qquad (26)$$

The model error term $\delta x^m_{k+1}$ can be thought to represent the sum of all the contributions to the state vector
which are not accounted for by the tangent linear propagator.

Assuming that the model error is uncorrelated with the analysis error, the evolution equation of the forecast 
error covariance matrix within the assimilation interval is given by: 

$$P^f_{k+1} = M_{k+1,k} P^a_k M^T_{k+1,k} + P^m_{k+1} \qquad (27)$$

The analysis and forecast error covariance matrices are related through the gain matrix, according to:

$$P^a_k = [I - K_k H_k] P^f_k \qquad (28)$$

Expressions (23) and (27) are usually referred to as the EKF prediction equations, while Eqs. (24), (25) and
(28) are referred to as the EKF analysis equations. If all hypotheses are verified, the analysis given by Eq. (24) corresponds
to the minimum variance, unbiased estimate of the true system's state.
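
The analysis and covariance prediction steps can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the experiments: the function names are hypothetical, the observation operator is assumed to be already linearized into a matrix `H`, and the tangent linear propagator `M` is assumed to be supplied by the caller.

```python
import numpy as np

def ekf_analysis(xf, Pf, yo, H, R):
    """Analysis step, Eqs. (24), (25) and (28), for a linearized observation matrix H."""
    S = H @ Pf @ H.T + R                       # innovation covariance
    # Gain K = Pf H^T S^{-1}, computed with a linear solve instead of an explicit inverse.
    K = np.linalg.solve(S.T, (Pf @ H.T).T).T
    d = yo - H @ xf                            # innovation vector
    xa = xf + K @ d                            # Eq. (24)
    Pa = (np.eye(xf.size) - K @ H) @ Pf        # Eq. (28)
    return xa, Pa

def ekf_forecast_covariance(Pa, M, Pm):
    """Covariance prediction, Eq. (27): tangent linear propagation plus model error."""
    return M @ Pa @ M.T + Pm

# Toy two-variable example with hypothetical values.
xf = np.array([1.0, 3.0])
yo = np.array([3.0, 1.0])
Pf, H, R = np.eye(2), np.eye(2), np.eye(2)
xa, Pa = ekf_analysis(xf, Pf, yo, H, R)
Pf_next = ekf_forecast_covariance(Pa, M=2.0 * np.eye(2), Pm=0.1 * np.eye(2))
```

With equal forecast and observation error covariances the analysis falls halfway between forecast and observation, which gives a quick sanity check of the gain.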

According to the discussion in Section 2, if the model error is an additive white noise, its impact on the
estimation error is related to the covariance of the random process, the matrix $Q$. Consequently, assuming that we
have access to (or to an estimate of) $Q$, the model error covariance matrix in Eq. (27) can be estimated through
the short time linear approximation, Eq. (19), as $P^m = Q\tau$.

On the other hand, the short time evolution of the deterministic model error covariance is bound to be
quadratic. In this latter case, by neglecting the correlation terms, the model error covariance matrix can be
approximated as $P^m = Q\tau^2$, where $Q$ is the covariance of the uncertainties associated with the deterministic
model error. Thus, the two estimates of $P^m$, based respectively on the white noise or on the deterministic
process assumption, differ by a multiplicative factor equal to the assimilation interval $\tau$.

Given these premises, we investigate the possibility of using the short time approximation, Eq. (18), to
describe the error evolution between assimilation times along the EKF analysis cycle. As long as the observational
forcing is frequent enough (small $\tau$) and the error is efficiently reduced by the assimilation of observations, the
short time error dynamics should provide a reliable description of the actual model error evolution between
two successive analyses. Similarly, Eq. (16) may be used in an EKF analysis cycle to estimate the bias in the
forecast error; this bias can then be removed from the forecast field before the latter is used as the background
in the analysis update, Eq. (24).

Let us suppose that we have access to statistical information about the parametric error of our model, in
the form of the mean $\langle \delta\mu \rangle$ and its covariance $Q = \langle (\delta\mu - \langle \delta\mu \rangle)(\delta\mu - \langle \delta\mu \rangle)^T \rangle$. At the analysis times,
the forecast bias due to the model error can be estimated according to the short time approximation, Eq. (16),
that is:

$$\langle \delta x^{m} \rangle = \langle \delta\mu \rangle (t_{k+1} - t_k) = \langle \delta\mu \rangle \tau \qquad (29)$$

The bias can then be removed from the forecast field before the EKF analysis is performed:

$$\tilde{x}^{f} = x^{f} - \langle \delta x^{m} \rangle \qquad (30)$$

The new background state $\tilde{x}^{f}$ is finally used for the analysis update, Eq. (24).
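
As a small illustration of the bias removal step, the sketch below applies Eqs. (29) and (30) exactly as written above. The function name and the numerical values are hypothetical.

```python
import numpy as np

def remove_model_bias(xf, mean_dmu, tau):
    """Subtract the short-time model error bias from the forecast state."""
    bias = mean_dmu * tau      # Eq. (29): mean model error times assimilation interval
    return xf - bias           # Eq. (30), following the sign convention of the text

# Hypothetical two-variable forecast and mean model error.
xf = np.array([1.0, 2.0])
mean_dmu = np.array([0.5, -1.0])
xf_corrected = remove_model_bias(xf, mean_dmu, tau=0.25)
```

The corrected state would then replace the raw forecast as the background in the analysis update.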

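Similarly, the model error covariance matrix can be estimated with the constant matrix:

$$P^m = P^m_{dp} \approx \langle (\delta\mu - \langle \delta\mu \rangle)(\delta\mu - \langle \delta\mu \rangle)^T \rangle \, \tau^2 \qquad (31)$$

where the suffix "dp" stands for deterministic process. Alternatively, in the case of the white noise model
error assumption,

$$P^m = P^m_{wn} \approx \langle (\delta\mu - \langle \delta\mu \rangle)(\delta\mu - \langle \delta\mu \rangle)^T \rangle \, \tau \qquad (32)$$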

These formulations of the EKF in which model error is treated as a deterministic process or a white noise 
are tested numerically through the implementation of Observation System Simulation Experiments (OSSE) 
(Bengtsson et al., 1981), see Section 4.2. 

In both formulations the bias is removed, Eqs. (29) and (30). Strictly speaking, if the model error is a white
noise, its mean is identically zero, and no model related bias should be present. In this case, the matrix $P^m_{wn}$
can be interpreted to only account for the variability of the model error around its mean.

Note that in practice, the statistical information on the model error, $\langle \delta\mu \rangle$ and $Q$, is not easy to
evaluate. In the present work, since the origin of this model error is known, we perform the statistics over the
whole attractor in order to get an invariant estimate of the model error moments.
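
The moments entering Eqs. (29), (31) and (32) can be estimated from a collection of model error samples. The sketch below assumes that such samples are already available as the rows of an array; the function names and the toy samples are hypothetical, and the covariance is normalized by the number of samples to match the definition of $Q$ as an ensemble average.

```python
import numpy as np

def model_error_statistics(dmu_samples):
    """Mean and covariance of model error samples (rows: samples, columns: state components)."""
    mean = dmu_samples.mean(axis=0)
    anomalies = dmu_samples - mean
    Q = anomalies.T @ anomalies / dmu_samples.shape[0]
    return mean, Q

def model_error_covariance(Q, tau, deterministic=True):
    """P^m as in Eq. (31) (deterministic, Q tau^2) or Eq. (32) (white noise, Q tau)."""
    return Q * tau**2 if deterministic else Q * tau

# Hypothetical samples for a two-variable system.
samples = np.array([[1.0, 0.0], [3.0, 0.0], [1.0, 2.0], [3.0, 2.0]])
mean_dmu, Q = model_error_statistics(samples)
Pm_dp = model_error_covariance(Q, tau=0.5, deterministic=True)
Pm_wn = model_error_covariance(Q, tau=0.5, deterministic=False)
```

In an actual experiment the rows would be model error vectors evaluated at states sampled along a long model integration.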
4 Numerical experiments

In the first part of this section we study the evolution of the estimation error in the presence of errors in the
initial conditions and in the model parameters. We focus on the accuracy of the short time dynamics of the
error and of the error covariance, the latter given by Eq. (17). In the second part of the section, OSSEs
are performed with the aim of testing the formulation of the EKF in which the dynamics of the model error is
accounted for.

These questions are investigated in the context of a low-order atmospheric model giving rise to chaotic 
dynamics. The model, introduced by Lorenz (1996), possesses 36 scalar variables representing the values of 
some meteorological quantity $x_i$, $i = \{1, ..., 36\}$, along a latitudinal circle. The evolution equations read:

$$\frac{dx_i}{dt} = \alpha (x_{i+1} - x_{i-2}) x_{i-1} - \beta x_i + F = f(x, \alpha, \beta, F), \quad i = \{1, ..., 36\} \qquad (33)$$

The quadratic terms simulate the advection, the linear term the internal dissipation, with $\alpha$ and $\beta$ being the
advection and dissipation parameters respectively, while $F$ represents the external forcing; a detailed description
of the model can be found in Lorenz (1996). The numerical integrations have been performed using a fourth- 
order Runge-Kutta scheme with a time step of 0.0083 units, corresponding to 1 hour of simulated time. 
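
A compact sketch of the model tendency, Eq. (33), and of a fourth-order Runge-Kutta step is given below. The function names are hypothetical and the cyclic boundary conditions are implemented with `np.roll`; the time step is the one quoted above.

```python
import numpy as np

def lorenz96_tendency(x, alpha=1.0, beta=1.0, forcing=8.0):
    """Right-hand side of Eq. (33) with cyclic indices."""
    # np.roll(x, -1)[i] = x[i+1], np.roll(x, 2)[i] = x[i-2], np.roll(x, 1)[i] = x[i-1]
    return alpha * (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - beta * x + forcing

def rk4_step(x, dt, **params):
    """One fourth-order Runge-Kutta step of the model."""
    k1 = lorenz96_tendency(x, **params)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, **params)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, **params)
    k4 = lorenz96_tendency(x + dt * k3, **params)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

# Integrate 24 steps (one simulated day) from a slightly perturbed rest state.
dt = 0.0083
x = 8.0 * np.ones(36)
x[0] += 0.01
for _ in range(24):
    x = rk4_step(x, dt)
```

The uniform state $x_i = F$ is a fixed point of Eq. (33) for $\alpha = \beta = 1$, which provides a simple check of the tendency function.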

For $\alpha = \beta = 1$ and $F = 8$, the model behaves chaotically and the first Lyapunov exponent, computed over a
period of 200 years after a spin-up of 10 years, is equal to $\sigma_1 = 0.33$ day$^{-1}$, which corresponds to a doubling time
of about 2 days. When the external forcing is reduced, the system becomes progressively more stable (for sufficiently
small $F$ all solutions decay to $x_1 = \dots = x_{36} = F$); the opposite occurs when the external forcing is increased.
Increasing (decreasing) the internal dissipation makes the system more stable (unstable), as expected. When
the advection coefficient is modified the stability properties do not vary significantly. The asymptotic stability
properties of Eq. (33) for a set of parameter values are summarized in Tab. 1. This set of parameter values
is used in the numerical experiments, while the true dynamics is represented by the model with the canonical
values $\alpha = \beta = 1$ and $F = 8$. The amplitude of the parameter error (20%) has been chosen as the minimum
value which leads to an appreciable deterioration of the quality of the model prediction. The focus is placed on the two
model configurations given in the last two lines of Tab. 1 (hereafter referred to as "$C_I$" and "$C_{II}$", respectively),
since in these cases all three model parameters are modified simultaneously.
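
The order of magnitude of the leading Lyapunov exponent and of the doubling time can be checked with a short two-trajectory renormalization run. The sketch below is only indicative: it uses a much shorter integration than the 200 years quoted above, a simple separation-rescaling estimate instead of the full Gram-Schmidt spectrum, and hypothetical function names. One day is taken as 24 time steps of 1/120 units.

```python
import numpy as np

DT = 1.0 / 120.0     # 1 hour of simulated time
STEPS_PER_DAY = 24


def l96_tendency(x, forcing=8.0):
    """Eq. (33) with alpha = beta = 1; works on a single state or a batch."""
    return ((np.roll(x, -1, axis=-1) - np.roll(x, 2, axis=-1))
            * np.roll(x, 1, axis=-1) - x + forcing)


def rk4_step(x, dt=DT):
    k1 = l96_tendency(x)
    k2 = l96_tendency(x + 0.5 * dt * k1)
    k3 = l96_tendency(x + 0.5 * dt * k2)
    k4 = l96_tendency(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def leading_lyapunov_exponent(n_spinup=2000, n_steps=12000, d0=1e-8, seed=0):
    """Average log growth rate of a small separation, per model time unit."""
    rng = np.random.default_rng(seed)
    x = 8.0 + rng.standard_normal(36)
    for _ in range(n_spinup):              # discard the transient
        x = rk4_step(x)
    pert = rng.standard_normal(36)
    pair = np.stack([x, x + d0 * pert / np.linalg.norm(pert)])
    log_growth = 0.0
    for _ in range(n_steps):
        pair = rk4_step(pair)              # both trajectories advance together
        diff = pair[1] - pair[0]
        d = np.linalg.norm(diff)
        log_growth += np.log(d / d0)
        pair[1] = pair[0] + diff * (d0 / d)  # rescale the separation back to d0
    return log_growth / (n_steps * DT)


sigma1_per_unit = leading_lyapunov_exponent()
sigma1_per_day = sigma1_per_unit * DT * STEPS_PER_DAY  # one day = 0.2 time units
doubling_time_days = np.log(2.0) / sigma1_per_day
```

For reference, the value $\sigma_1 = 0.33$ day$^{-1}$ quoted in the text gives $\ln 2 / 0.33 \approx 2.1$ days, consistent with the stated doubling time of about 2 days.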

According to Eq. (Jl]), the state dependent model error can be written as: 

$$\delta\mu_i(x, \Delta\lambda) = \Delta\alpha (x_{i+1} - x_{i-2}) x_{i-1} - \Delta\beta x_i + \Delta F, \quad i = \{1, \dots, 36\} \qquad (34)$$

where the terms $\Delta\alpha$, $\Delta\beta$ and $\Delta F$ are the errors in the specification of the advection, dissipation and external
forcing respectively, $\Delta\lambda = (\Delta\alpha, \Delta\beta, \Delta F)$ is the vector of parameter errors, and the true parameters are $\lambda = (\alpha, \beta, F) = (1, 1, 8)$.
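
Since the model tendency is linear in the parameters, Eq. (34) is simply the difference between the tendencies of the imperfect and the true model evaluated at the same state. The sketch below checks this numerically. It assumes that the parameter error is defined as model minus true value, and the 20% perturbations used here are hypothetical, not the entries of Tab. 1.

```python
import numpy as np


def l96_tendency(x, alpha=1.0, beta=1.0, forcing=8.0):
    """Right-hand side of Eq. (33)."""
    return alpha * (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - beta * x + forcing


def model_error(x, d_alpha, d_beta, d_forcing):
    """State-dependent model error of Eq. (34)."""
    return d_alpha * (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - d_beta * x + d_forcing


rng = np.random.default_rng(1)
x = 8.0 * rng.random(36)  # arbitrary state, for illustration only

# Hypothetical errors: +20% advection, -20% dissipation, +20% forcing
d_alpha, d_beta, d_forcing = 0.2, -0.2, 1.6
delta_mu = model_error(x, d_alpha, d_beta, d_forcing)

# Same quantity as a difference of tendencies (assumes error = model - truth)
tendency_gap = l96_tendency(x, 1.2, 0.8, 9.6) - l96_tendency(x, 1.0, 1.0, 8.0)
```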



4.1 Error dynamics in the presence of parametric model error 

The experiments described in this section are designed to investigate the accuracy of the linearized, short time
approximations, Eqs. (20) and (21). As stated above, the true dynamics is given by the model, Eq. (33), with
$\alpha = \beta = 1$ and $F = 8$. We study the quality of the state estimate prediction when the approximate model is
described by the parameters of Tab. 1.

A sample of 10 initial conditions over the true system attractor is generated through a long time integration. 
A set of $10^5$ model initial conditions is produced such that the initial error is $\delta x_0 \in \mathcal{N}(0, \sigma_0)$; $\sigma_0$ is fixed to
10% of the system's natural variability. 

In Fig. 1 the mean error evolution is compared with the short time approximations, Eqs. (20) and (21).
The mean actual error, the mean initial condition error (first r.h.s. term of Eq. (20)) and the total
error, model plus initial condition, are displayed for the configurations $C_I$ and $C_{II}$. The error fields are shown
as a function of the model gridpoints at times $t = 0$, 12 and 24 hours. The results indicate that the approximate
linear evolution, Eq. (20), describes the actual mean error evolution with good accuracy
up to $t = 12$ hours, the discrepancy between the actual error and its estimate being of the order of 10% of
the error amplitude. At $t = 24$ hours, the linear approximation becomes less accurate. Note that the prediction
error is dominated by the model error. Similar results (not shown) are obtained when choosing other parameter
values listed in Tab. 1.

Figure 2 shows the actual state estimation error variance (the trace of the covariance matrix), along with the
initial condition error variance (the trace of $P^{ic}$) and the short time error variance evolution, modeled by Eq.
(21). The eight plots refer to experiments with different parametric errors, according to the values given in Tab.
1. The figure indicates that in most cases Eq. (21) provides a more accurate description of the actual
quadratic error than the initial condition term alone. The estimation error variance due to model parameter
misspecification grows quadratically for short times, up to a time proportional to the inverse of the most negative
Lyapunov exponent (see Tab. 1), in agreement with the theory (Nicolis, 2003). For $\Delta\beta/\beta = 20\%$ the addition
of the quadratic model error evolution does not lead to a significant difference, while for $\Delta\beta/\beta = -20\%$ it slightly
deteriorates the prediction. Note, however, that for $\Delta\beta/\beta = \pm 20\%$ the impact of the model error is very small
during the 24 hours prediction. Furthermore, the results of Fig. 2 suggest that the additional terms, which account for
the correlation between model and initial condition error and are not included in the estimate described
in the figure, do not contribute much to the error dynamics in the situations considered.
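
The short time quadratic growth of the error variance due to model error alone can be illustrated with a small experiment in which the true and the imperfect model start from identical states, so that no initial condition error is present. This is a sketch under stated assumptions: the parameter errors are hypothetical 20% perturbations (not a configuration of Tab. 1), only 50 states are sampled, and the quadratic law is taken as $\langle |\delta\mu|^2 \rangle t^2$.

```python
import numpy as np

DT = 1.0 / 120.0  # 1 hour of simulated time
TRUE = dict(alpha=1.0, beta=1.0, forcing=8.0)
MODEL = dict(alpha=1.2, beta=0.8, forcing=9.6)  # hypothetical parameter errors


def l96_tendency(x, alpha, beta, forcing):
    """Eq. (33); works on a single state or on a batch of states."""
    return (alpha * (np.roll(x, -1, axis=-1) - np.roll(x, 2, axis=-1))
            * np.roll(x, 1, axis=-1) - beta * x + forcing)


def rk4_step(x, params, dt=DT):
    k1 = l96_tendency(x, **params)
    k2 = l96_tendency(x + 0.5 * dt * k1, **params)
    k3 = l96_tendency(x + 0.5 * dt * k2, **params)
    k4 = l96_tendency(x + dt * k3, **params)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def sample_attractor(n_samples, seed=0, spinup=2000, stride=50):
    """States of the true system taken every `stride` hours after a spin-up."""
    rng = np.random.default_rng(seed)
    x = 8.0 + rng.standard_normal(36)
    for _ in range(spinup):
        x = rk4_step(x, TRUE)
    samples = []
    for _ in range(n_samples):
        for _ in range(stride):
            x = rk4_step(x, TRUE)
        samples.append(x.copy())
    return np.array(samples)


states = sample_attractor(50)
delta_mu = l96_tendency(states, **MODEL) - l96_tendency(states, **TRUE)

truth, model = states.copy(), states.copy()
mse = []
for _ in range(6):                       # follow the error for 6 hours
    truth = rk4_step(truth, TRUE)
    model = rk4_step(model, MODEL)
    mse.append(np.mean(np.sum((model - truth) ** 2, axis=1)))
mse = np.array(mse)

hours = np.arange(1, 7)
quadratic_law = np.mean(np.sum(delta_mu ** 2, axis=1)) * (hours * DT) ** 2
```

In the quadratic regime, doubling the lead time should multiply the mean square error by roughly four, and `mse` should stay close to `quadratic_law` for the first few hours.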

Figure 3 depicts the evolution of the error covariance between $x_1$ and $x_{36}$ and between $x_{20}$ and $x_{30}$ for the
cases $C_I$ and $C_{II}$. We see that, although the accuracy is reduced with respect to the case of the error variance
(Fig. 2), Eq. (21) still provides an accurate approximation of the actual error dynamics.

Some recent advanced data assimilation techniques are designed to track and control the instabilities which 
grow along a trajectory solution of a data assimilation cycle and try to optimally use the observations available 
to reduce the portion of the actual error which projects on the unstable direction of the system (Carrassi et 
al., 2007). The basic paradigm is that a relevant portion of the actual estimation error evolves according to the
system's unstable subspace dynamics so that reducing the error in this subspace maximizes the overall effect 
of the assimilation. As a result, the analysis error is mainly confined in the complement of the subspace where 
the analysis increment is confined. 

In all the experiments described so far, we have used a random distribution of initial condition errors. In 
relation to the aforementioned advanced data assimilation algorithms, an additional relevant question concerns 
the accuracy of the proposed short time approximate equations, when the initial condition errors (i.e. the 
analysis errors) are confined to the unstable/stable subspace of the model's solution. To this aim experiments 
similar to the ones described above have been performed, except that the sampling of the initial condition error 
is made after an initial transient of 2 years necessary to compute the Lyapunov exponents of the system. The 
latter, as well as the Lyapunov vectors, are estimated using the standard Gram-Schmidt procedure (Benettin 
and Galgani, 1980). The initial condition errors are then sampled with the constraint of being either aligned 
to the first Lyapunov vector or orthogonal to the local unstable subspace (in practice being aligned with the
direction associated with the most negative Lyapunov exponent). The results (not shown) for the parametric error
configurations $C_I$ and $C_{II}$ indicate that, although the initial condition error evolution is clearly conditioned by
constraining the initial condition error to the stable/unstable subspace, the effect on the accuracy and duration 
of the approximate model error quadratic regime is negligible. 

In conclusion, the numerical experiments indicate that, in the short time (within 24 hours), Eqs. (20)
and (21) may be used to model the mean and covariance error evolution within the assimilation interval in a
sequential assimilation scheme.

4.2 EKF in the presence of parametric model error

OSSEs with the EKF are next performed in the context of the Lorenz 36-variable model (Lorenz, 1996). A
homogeneous network of 18 observations is used, in which every other model variable is observed ($x_i$, $i = 1, 3, \dots, 35$).
The simulated observations are generated by sampling the reference true trajectory with an unbiased Gaussian
random measurement error of variance $\sigma_o^2$. The observation error variance is $\sigma_o^2 = 2.5\%$ of the
system's climate variance and three assimilation intervals, $\tau$, are considered: 12, 6 and 3 hours. The model is
represented by Eq. (33) with the parameters given in Tab. 1. All the experiments are performed for 6 years;
the first year of simulated time is considered as a spin-up period so that all results and statistics refer to the
last 5 years.
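
The observational setup described above can be sketched as a selection operator plus additive Gaussian noise. In this sketch the names are hypothetical, and both the climate variance and the state vector are placeholder values used only to make the example runnable; in an actual experiment they would come from a long run of the true system.

```python
import numpy as np

N = 36
obs_sites = np.arange(0, N, 2)  # zero-based indices of x_1, x_3, ..., x_35
H = np.zeros((obs_sites.size, N))
H[np.arange(obs_sites.size), obs_sites] = 1.0  # each row picks one observed variable


def make_observations(x_true, climate_variance, rng, relative_error_variance=0.025):
    """Sample the truth at the observed sites and add unbiased Gaussian noise."""
    sigma_o2 = relative_error_variance * climate_variance
    noise = rng.normal(0.0, np.sqrt(sigma_o2), size=H.shape[0])
    R = sigma_o2 * np.eye(H.shape[0])  # observation error covariance
    return H @ x_true + noise, R


rng = np.random.default_rng(0)
x_true = rng.normal(2.0, 3.0, size=N)  # placeholder state, not a model trajectory
climate_variance = 13.0                # placeholder value
y, R = make_observations(x_true, climate_variance, rng)
```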

First, we report the performance of the filter, in terms of the accuracy of its state estimate, when a perfect 
model is used and consequently no model error treatment is employed in the EKF (i.e. $P^m = 0$ in Eq. (27)).
It is known that because of the nonlinearities, even in a perfect model scenario, the EKF solution may drift
away from the true trajectory (Miller et al., 1994). A few empirical solutions have been proposed in previous 
studies with the EKF in order to reduce or avoid the filter divergence; namely the multiplicative forecast error 
variance inflation (Anderson, 2001) or the addition of random perturbations to the diagonal of the analysis error 
covariance matrix to enhance its explained variance (Corazza et al., 2003; Yang et al., 2006). In our experiments 
with the EKF we have opted for the latter solution; random perturbations S are added to the diagonal of the 
analysis error covariance matrix after the analysis update, Eq. (28): $S = \xi a \sigma_o^2$, with $0 < \xi < 1$ a random
number extracted from a normal distribution and $0 < a < 1$ a tunable scalar coefficient. The latter, optimized
by minimizing the average EKF analysis error over 5 years, is equal to 0.2.
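
A minimal sketch of the analysis step followed by the additive diagonal perturbation is given below. It uses the standard Kalman gain and covariance update, which are assumed here to correspond to the analysis equations of Sect. 3. The perturbation is taken as $\xi a \sigma_o^2$ with one independent $\xi$ per diagonal element drawn uniformly in $[0, 1)$; both the form of the perturbation and the sampling of $\xi$ are assumptions of this sketch, as are the function names.

```python
import numpy as np


def ekf_analysis(x_f, P_f, y, H, R):
    """Standard Kalman analysis update for a linear observation operator H."""
    S = H @ P_f @ H.T + R                    # innovation covariance
    K = np.linalg.solve(S, H @ P_f).T        # K = P_f H^T S^{-1} (S, P_f symmetric)
    x_a = x_f + K @ (y - H @ x_f)
    P_a = (np.eye(x_f.size) - K @ H) @ P_f
    return x_a, 0.5 * (P_a + P_a.T)          # enforce symmetry against round-off


def inflate_diagonal(P_a, sigma_o2, a=0.2, rng=None):
    """Add random perturbations xi * a * sigma_o2 to the diagonal of P_a."""
    rng = np.random.default_rng() if rng is None else rng
    xi = rng.random(P_a.shape[0])            # assumed: uniform in [0, 1), one per element
    return P_a + np.diag(xi * a * sigma_o2)


# Toy example: 36 variables, every other one observed, placeholder covariances
n = 36
H = np.zeros((18, n))
H[np.arange(18), np.arange(0, n, 2)] = 1.0
sigma_o2 = 0.325
x_f, P_f = np.zeros(n), np.eye(n)
y, R = np.ones(18), sigma_o2 * np.eye(18)

x_a, P_a = ekf_analysis(x_f, P_f, y, H, R)
P_a_inflated = inflate_diagonal(P_a, sigma_o2, a=0.2, rng=np.random.default_rng(0))
```

With a diagonal $P^f$ as in this toy example, only the observed variables are corrected; in the actual filter the off-diagonal terms of $P^f$ spread the correction to unobserved variables.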

When a perfect model is used, the overall performance of the filter is good. The time mean analysis error 
variance, over the 5 years, is equal to 0.89%, 0.76% and 0.66% of the system climate variance, with assimilation 
intervals equal to 12, 6 and 3 hours respectively. 

We now introduce parametric errors in the model. As a first step, the filter performance without
model error treatment is analyzed. Figure 4 shows the time running mean of the analysis error variance as a
function of time, for experiments in which the model is in configuration $C_I$ and $C_{II}$. The perfect model case
is also displayed for reference. The impact of model error is dramatic: in the configuration $C_I$ the average
analysis error is about four times the corresponding error when a perfect model is used, while it is more than
one order of magnitude larger in the configuration $C_{II}$; in this latter case filter divergence occurs when $\tau = 12$ hours.
In all cases, the average analysis error is larger than the observation error variance even when the shortest
assimilation interval, $\tau = 3$ hours, is used. Besides the increased mean error level, the curves show abrupt jumps
(in particular when $\tau = 12$ or 6 hours) indicating that, in the presence of model error, the filter undergoes large
error fluctuations during the analysis cycle. Large error spikes may also be due to changes in the system regime
which are not tracked by the filter solution. Similar experiments, with the EKF assuming a perfect model
scenario, have been performed for all the model parameter configurations given in Tab. 1 and the corresponding
time average analysis errors are summarized in Tab. 2. As expected, the presence of the model error systematically
deteriorates the performance of the filter.

Before implementing the EKF with the model error covariance matrix estimated through Eq. (31) or (32),
the question of the accuracy of the linearized model error evolution within the analysis intervals is addressed
in a more idealized experimental setting. We assume that the model error at each analysis time, $\delta\mu^a_k$, is known,
so that the model error at forecast time is estimated according to Eq. (15), that is:

$$\delta\mu^f_{k+1} \approx \delta\mu^a_k (t_{k+1} - t_k) = \delta\mu^a_k \tau \qquad (35)$$

where $t_k$ indicates an arbitrary analysis time along the assimilation cycle; $\delta\mu^a_k$ is evaluated through Eq. (34) by
using the analysis state $x^a$. This estimate of the model error is then removed from the background (forecast)
field before the EKF analysis update, as for the bias removal, Eq. (30), except that now a time-dependent
estimate of the model error is used and $P^m = 0$ in Eq. (27). This simplifies the interpretation of the results
since, by avoiding the uncertainty associated with the estimate of the model error covariance matrix, the main
source of error is the limited accuracy of the approximate model error dynamics, Eq. (35). The EKF employing
this time-dependent model error removal has been tested with the model in the configurations $C_I$ and $C_{II}$. In
both cases, the time average analysis error is reduced below the observation error variance. The time average
analysis errors in the configuration $C_I$ are 2.34%, 1.02% and 0.71% with assimilation intervals of 12,
6 and 3 hours respectively, while in the configuration $C_{II}$ they are 1.47% and 0.74% with $\tau = 6$ and 3 hours
respectively. Note that in the case $C_{II}$ with $\tau = 12$ hours the filter diverges, suggesting that the short time
approximation of the model error evolution is not sufficiently reliable on this time scale. The overall results are
encouraging: by employing this idealized time-dependent model error removal the EKF analysis errors attain
values comparable to those obtained with a perfect model (see the first line of Tab. 2).
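
The idealized model error removal of Eq. (35) can be illustrated on a single forecast. In the sketch below the analysis state is assumed to coincide with the truth, so that the forecast error is due to model error only; the parameter errors are hypothetical 20% perturbations, the model error is defined as model minus true tendency, and the function names are illustrative.

```python
import numpy as np

DT = 1.0 / 120.0  # 1 hour of simulated time
TRUE = dict(alpha=1.0, beta=1.0, forcing=8.0)
MODEL = dict(alpha=1.2, beta=0.8, forcing=9.6)  # hypothetical parameter errors


def l96_tendency(x, alpha, beta, forcing):
    return alpha * (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - beta * x + forcing


def rk4_step(x, params, dt=DT):
    k1 = l96_tendency(x, **params)
    k2 = l96_tendency(x + 0.5 * dt * k1, **params)
    k3 = l96_tendency(x + 0.5 * dt * k2, **params)
    k4 = l96_tendency(x + dt * k3, **params)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def forecast(x, params, n_hours):
    for _ in range(n_hours):
        x = rk4_step(x, params)
    return x


def model_error(x):
    """delta mu of Eq. (34), assuming parameter error = model minus truth."""
    return l96_tendency(x, **MODEL) - l96_tendency(x, **TRUE)


rng = np.random.default_rng(3)
x_a = forecast(8.0 + rng.standard_normal(36), TRUE, 2000)  # state on the true attractor

tau_hours = 3
tau = tau_hours * DT
x_true = forecast(x_a, TRUE, tau_hours)
x_f = forecast(x_a, MODEL, tau_hours)
x_f_corrected = x_f - model_error(x_a) * tau   # remove the estimate of Eq. (35)

raw_error = np.linalg.norm(x_f - x_true)
corrected_error = np.linalg.norm(x_f_corrected - x_true)
```

For a short interval the corrected forecast should be noticeably closer to the truth than the raw one; the residual reflects the higher-order terms neglected by the linear-in-time estimate.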

We now investigate the performance of the extensions of the EKF proposed in Sect. 3, in which the
forecast error covariance matrix, $P^f$, used in the EKF analysis update, also includes a representation of the
model error covariance. The latter is estimated by Eq. (32) for the white noise assumption and by Eq. (31)
for the deterministic case. The $P^m$ matrix is kept constant along the whole EKF analysis cycle. Furthermore, in
both cases, the bias, estimated through Eq. (29), is removed from the background field before the latter is used
in the analysis update. The model error mean, $\langle \delta\mu \rangle$, and covariance, $\langle (\delta\mu - \langle \delta\mu \rangle)(\delta\mu - \langle \delta\mu \rangle)^T \rangle$,
to be used in Eqs. (29), (31) and (32) are evaluated, for each of the parameter values described in Tab. 1, by
accumulating statistics over a sample of $10^5$ initial conditions on the system's attractor.
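
The way a constant $P^m$ enters the forecast step can be sketched as follows, assuming the usual EKF form $P^f = M P^a M^T + P^m$ with $M$ the tangent linear propagator over the assimilation interval. The Jacobian is derived analytically from Eq. (33); the function names, the placeholder covariances and the arbitrary initial state are assumptions of this sketch.

```python
import numpy as np

DT = 1.0 / 120.0  # 1 hour of simulated time


def l96_tendency(x, alpha=1.0, beta=1.0, forcing=8.0):
    return alpha * (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - beta * x + forcing


def l96_jacobian(x, alpha=1.0, beta=1.0):
    """Jacobian of the right-hand side of Eq. (33)."""
    n = x.size
    i = np.arange(n)
    jac = np.zeros((n, n))
    jac[i, (i + 1) % n] = alpha * np.roll(x, 1)                      # d f_i / d x_{i+1}
    jac[i, (i - 2) % n] = -alpha * np.roll(x, 1)                     # d f_i / d x_{i-2}
    jac[i, (i - 1) % n] = alpha * (np.roll(x, -1) - np.roll(x, 2))   # d f_i / d x_{i-1}
    jac[i, i] = -beta                                                # d f_i / d x_i
    return jac


def step_state_and_tlm(x, M, dt=DT, alpha=1.0, beta=1.0, forcing=8.0):
    """Joint RK4 step of the state and of the tangent linear propagator."""
    def rhs(xs, Ms):
        return (l96_tendency(xs, alpha, beta, forcing),
                l96_jacobian(xs, alpha, beta) @ Ms)
    k1x, k1m = rhs(x, M)
    k2x, k2m = rhs(x + 0.5 * dt * k1x, M + 0.5 * dt * k1m)
    k3x, k3m = rhs(x + 0.5 * dt * k2x, M + 0.5 * dt * k2m)
    k4x, k4m = rhs(x + dt * k3x, M + dt * k3m)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            M + dt * (k1m + 2.0 * k2m + 2.0 * k3m + k4m) / 6.0)


def ekf_forecast(x_a, P_a, P_m, n_steps, **params):
    """Propagate state and covariance; P_m is added once per interval."""
    x, M = np.array(x_a, dtype=float), np.eye(len(x_a))
    for _ in range(n_steps):
        x, M = step_state_and_tlm(x, M, **params)
    return x, M @ P_a @ M.T + P_m, M


rng = np.random.default_rng(4)
x_a = 8.0 * rng.random(36)          # arbitrary state, for illustration only
P_a = 0.01 * np.eye(36)             # placeholder analysis error covariance
P_m = np.zeros((36, 36))            # perfect model case
x_f, P_f, M = ekf_forecast(x_a, P_a, P_m, n_steps=6)

# Consistency check: M should map a small initial perturbation to its evolved value
dx0 = 1e-6 * rng.standard_normal(36)
x_f_pert, _, _ = ekf_forecast(x_a + dx0, P_a, P_m, n_steps=6)
tlm_mismatch = (np.linalg.norm((x_f_pert - x_f) - M @ dx0)
                / np.linalg.norm(x_f_pert - x_f))
```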

Figure 5 shows the running mean EKF analysis error as a function of time, under the white noise and deterministic
process assumptions, with assimilation intervals $\tau = 12$, 6 and 3 hours. The plots refer to the model
parametric configurations $C_I$ and $C_{II}$. In both situations, the best performance is obtained when the model
error covariance matrix is built on the basis of a quadratic error evolution law between assimilation times. In the
case of the more unstable configuration $C_{II}$, the improvement is even more pronounced: for instance, when
the model error is treated as a white noise, a 3-hour assimilation interval is needed to reach mean error values
smaller than those obtained in the deterministic case with an assimilation interval of 12 hours. It is furthermore
relevant to note that most of the analysis error jumps apparent in the running mean are significantly reduced
by employing the deterministic model error description.

Other analysis cycle experiments have been performed for all the parametric configurations given in Tab. 1
and the results are reported in Tab. 3 and in Tab. 4 for the white noise and deterministic case respectively.
For almost all the model parameters considered, the best results are obtained when the model error is treated as a
deterministic process: the only exception is the case in which the actual dissipation parameter is underestimated
($\Delta\beta/\beta = 20\%$) and the assimilation interval is $\tau = 12$ hours. We have seen in Section 4.1 that this is one of
the cases in which the model error has only a minor impact on the prediction error. In this circumstance the
accuracy of the filter solution is mainly related to the accuracy of the forecast error description based on the
linear propagation of the estimated analysis error. We argue that the addition of a larger model error covariance
matrix (such as that corresponding to the white noise assumption: $P^m_{wn} = Q\tau > P^m_{dp} = Q\tau^2$) helps to
prevent a possible underestimation of the actual forecast error covariance occurring in the presence of strong
nonlinearities.
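
The two model error covariance matrices compared above can be built from a sample of model errors on the attractor, as in the following sketch. Here $Q$ is taken as the sample covariance of $\delta\mu$, $\tau$ is expressed in model time units, the parameter errors are hypothetical and only 200 states are used instead of the $10^5$ of the actual experiments; all of these are assumptions made to keep the example short.

```python
import numpy as np

DT = 1.0 / 120.0  # 1 hour of simulated time
TRUE = dict(alpha=1.0, beta=1.0, forcing=8.0)
MODEL = dict(alpha=1.2, beta=0.8, forcing=9.6)  # hypothetical parameter errors


def l96_tendency(x, alpha, beta, forcing):
    """Eq. (33); works on a single state or on a batch of states."""
    return (alpha * (np.roll(x, -1, axis=-1) - np.roll(x, 2, axis=-1))
            * np.roll(x, 1, axis=-1) - beta * x + forcing)


def rk4_step(x, params, dt=DT):
    k1 = l96_tendency(x, **params)
    k2 = l96_tendency(x + 0.5 * dt * k1, **params)
    k3 = l96_tendency(x + 0.5 * dt * k2, **params)
    k4 = l96_tendency(x + dt * k3, **params)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def sample_attractor(n_samples, stride=20, spinup=2000, seed=0):
    """States of the true system taken every `stride` hours after a spin-up."""
    rng = np.random.default_rng(seed)
    x = 8.0 + rng.standard_normal(36)
    for _ in range(spinup):
        x = rk4_step(x, TRUE)
    samples = []
    for _ in range(n_samples):
        for _ in range(stride):
            x = rk4_step(x, TRUE)
        samples.append(x.copy())
    return np.array(samples)


def model_error_statistics(samples, tau):
    """Mean of delta mu, white noise matrix Q*tau and deterministic matrix Q*tau^2."""
    delta_mu = l96_tendency(samples, **MODEL) - l96_tendency(samples, **TRUE)
    mean = delta_mu.mean(axis=0)
    Q = np.cov(delta_mu, rowvar=False)
    return mean, Q * tau, Q * tau ** 2


samples = sample_attractor(200)
tau = 6 * DT                                   # 6-hour assimilation interval
mean_dmu, P_wn, P_dp = model_error_statistics(samples, tau)
```

Because $\tau < 1$ in model time units for all the intervals considered, the white noise matrix is larger than the deterministic one by a factor $1/\tau$.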

Figure 6 illustrates the EKF analysis error variance as a function of time for the experiments of Fig. 5 with
$\tau = 6$ hours; the figure displays the analysis error during the last year of simulated time. As already evident
from Fig. 5, the average analysis error is lower in the deterministic case. From Fig. 6 we further see that the
variability about the mean values appears slightly smaller too and, except for a few instances, error peaks are
all reduced by treating model error as a deterministic process.

A further illustration of the ability of the filter to track the nature evolution is given in Fig. 7, which shows the
true value of $x_{15}$, the observations, taken every 6 hours, and the two filter solutions in the case of white noise
or deterministic model error. The four plots display the field values during the first week of the 1st, 4th, 7th and
10th month of the last year of simulated time. There are instances, particularly in correspondence
with the maxima and minima, in which the filter with the deterministic model error formulation appears
to provide a smoother solution than under the white noise assumption. In some cases (for instance around day 2
in the top left panel, or day 271 in the bottom right panel) the analysis solution in the deterministic case remains
closer to the true solution in spite of the large observation error. This indicates the accurate estimation of the
forecast error accomplished by modeling model error as a deterministic process. The white noise assumption,
on the other hand, seems to lead to an overestimation of the actual forecast error so that the analysis gives a
larger, improper, weight to the observations.

To further examine the quality of the deterministic approach, it is interesting to study the performance of
the filter when tuning the amplitude of the model error covariance used in the EKF, that is $P^m = \gamma P^m_{dp}$. This
allows us to evaluate, on a heuristic basis and a posteriori, how far from optimality the given model error
covariance matrix is. Figure 8 shows the time average analysis error as a function of the scalar coefficient $\gamma$, for
the three different assimilation intervals, $\tau = 3$, 6 and 12 hours, and for the configuration $C_{II}$. Remarkably, for
all the assimilation intervals considered, the minimum average analysis error is attained when $P^m = P^m_{dp}$. For
reference, the values corresponding to the EKF under the white noise hypothesis are labeled in the figure to
allow a direct evaluation of the improvement gained by employing a deterministic representation of the model
error. Note that, once an optimal model error covariance matrix has been estimated for a given assimilation
interval, the optimal amplitude of the model error covariance matrix can be estimated on the basis of the
quadratic time evolution for any (short) assimilation interval.
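
As a worked illustration of this rescaling, and assuming that the matrix $Q$ in $P^m_{dp} = Q\tau^2$ does not itself depend on the assimilation interval, a matrix tuned for an interval $\tau_1$ can be converted to another short interval $\tau_2$ as follows (the 6-hour reference interval is chosen only as an example):

```latex
P^m_{dp}(\tau) = Q\,\tau^2
\quad\Longrightarrow\quad
P^m_{dp}(\tau_2) = \left(\frac{\tau_2}{\tau_1}\right)^{2} P^m_{dp}(\tau_1),
\qquad \text{e.g.}\quad
P^m_{dp}(3\,\mathrm{h}) = \tfrac{1}{4}\, P^m_{dp}(6\,\mathrm{h}),
\quad
P^m_{dp}(12\,\mathrm{h}) = 4\, P^m_{dp}(6\,\mathrm{h}).
```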

Our analysis has so far focused on examining the performance of the filter with different analysis intervals. 
This was mainly motivated by the interest in exploring the range of validity of the short time approximation on 
which the model error formulation relies. The following experiments, on the other hand, are aimed at a sensitivity 
analysis. The robustness of the EKF is examined by modifying the properties of the observational network, 
more specifically the observation error amplitude and the number and location of the observations. Figure 9 is
obtained by using the same homogeneous distribution of 18 observations used so far, but the observation error
variance is varied from 0.5% to 5% of the system's climate variance. The model is in the configuration $C_{II}$.
Three types of EKF analysis cycles are compared: (i) a perfect model scenario, (ii) a white noise model error
and (iii) a deterministic model error. The figure shows the time average analysis error over the last 5 years of
the experiments as a function of the observation error variance, with an assimilation interval $\tau = 12$, 6 and 3
hours. For all the observation error values and assimilation intervals considered, the best performance is obtained
when model error is treated as a deterministic process. When the model error is not accounted for and
the assimilation interval is 12 hours, the filter diverges for all the values of the observation error. Note that by
reducing the observation error, the relative difference between the random and deterministic approaches reduces
too. This is reasonable since, when the observation error is very small, the resulting analysis is very accurate and
the forecast error covariance matrix $P^f$ is dominated by the model error component $P^m$. In this condition,
$P^f \approx P^m$ and the analysis does not depend on the amplitude of the forecast error covariance matrix, as
if the observations were perfect, (see Eq. EH]) . 

In Fig. 10 the time average analysis error is plotted as a function of the number of observations. In these
experiments, a random distribution of observations is deployed; their number ranges from 14 to 32 and, for
each of these values, 10 random distributions of observations are generated and analysis cycles with the EKF
are performed. The results are averaged in space and time as well as over the 10 observation distributions.
The observation error variance is fixed to 2.5% of the system climate variance. The deterministic model error
treatment is still the best in all the cases considered. It is interesting to note that the advantage gained
with the deterministic assumption increases with the number of observations: it seems that, as
the observing network is refined, the average estimation error is reduced and the linear hypothesis on which the
matrix P^ is built becomes more accurate. 

5 Summary and Conclusion 

Data assimilation constitutes nowadays a central part of the operational weather forecasting system. The 
purpose is to get an estimate of the state of the atmosphere as close as possible to reality and compatible with 
the simulating model at hand. To this end accurate information on the observations and on the model are 
necessary. A central problem is the discrepancy between the model and the true dynamics, the model error, for 
which simplistic assumptions are usually made such as, for instance, that of a white noise process. 

In the present work a different approach has been adopted, based on recent advances made on the dynamics 
of model errors. Our analysis does not make use of any a-priori assumption on the model error dynamics except 
its deterministic character implying the existence of a short term universal behavior of model errors deduced in 
Nicolis (2003; 2004). The analytical development of evolution equations for the mean and covariance error, in 
the presence of both initial condition and model uncertainties, has shown the way this deterministic approach 
affects the structure of the classical extended Kalman filter equations. 

The theoretical analysis is fully supported by numerical experiments in the context of a low order atmospheric 
model giving rise to chaotic behavior (Lorenz, 1996). Model errors are explicitly introduced in this system by 
perturbing the parameters with respect to some reference true values. It is shown that in the short time (less 
than 24 hours) the estimation error is accurately approximated by an evolution law in which the model error, 
treated as a deterministic process, is expanded in a Taylor series in time up to the first nontrivial order. The
correlation between model and initial condition error has only a minor impact on the short time evolution of
the error covariance matrix. Furthermore, the possibility of using the deterministic approach to account for the
model error dynamics in the extended Kalman filter has been explored and compared to the classical white noise
approach. To this end, observing system simulation experiments with the extended Kalman filter have been
performed in the Lorenz model (Lorenz, 1996). The numerical analysis gives a clear
indication of a substantial improvement of the filter accuracy (in terms of analysis error variance) when model
error is assumed to act as a deterministic process. The filter performance has been further examined by varying
either the assimilation intervals or the observational network properties (i.e. the observation number and error
variance). As long as the assimilation interval is short enough, the filter employing the quadratic model error
evolution law (deterministic process assumption) systematically outperforms the white noise case.

The existence of this universal, short time, quadratic law for the evolution of model error covariance might
turn out to be useful in all the situations in which some statistical information on the parametric model
error is available and can be used to estimate the model error covariance matrix. At the same time, if,
for a given assimilation interval, an optimal model error covariance matrix is at hand, the optimal matrix at 
different assimilation intervals can be evaluated straightforwardly on the basis of the quadratic law. 

It would be interesting to apply the deterministic approach outlined in this work to models of increasing 
complexity and for parametric errors which have a direct physical interpretation. Furthermore, a number of 
more fundamental questions remain to be addressed concerning the way to account for other types of model 
errors. In the present study, the focus was placed on parametric errors assuming that the model and the 
reference system span the same phase space. In a realistic setting, such as for instance numerical weather 
prediction, errors arise also from the effect of the unresolved scales as well as from the lack of the description 
of some relevant phenomena. These problems will be addressed in future work. 

Table 1: Summary of the model parameter values used in the experiments. The parametric model error is
expressed as a relative percentage of the correct, "true", parameters, $\lambda = (\alpha, \beta, F) = (1, 1, 8)$. The leading and
smallest Lyapunov exponents $\sigma_{max}$ and $\sigma_{min}$ are expressed in day$^{-1}$. $N_{\sigma^+}$: number of positive Lyapunov
exponents; KY-dim: Kaplan-Yorke dimension. Results are based on a 200-year-long integration after a
10-year-long spin-up period.

Table 2: EKF in perfect model scenario, $P^m = 0$. Time average analysis error variance. Model parametric
errors are given in the first column. Assimilation interval is given in the first row. All results are computed over
5 years of simulated time after a 1-year-long spin-up period. Errors are expressed as a percentage of the system
climate variance; "div" stands for filter divergence.

Table 3: EKF with white-noise model error assumption, $P^m = P^m_{wn}$. Time mean analysis error variance. The
linear, white noise, approximation, Eq. (32), is employed. Model parametric errors are given in the first column.
Assimilation interval is given in the first row. All results are computed over 5 years of simulated time after a
1-year-long spin-up period. Errors are expressed as a percentage of the system climate variance.

$\Delta\lambda/\lambda$                                                       $\tau = 12$ hrs   $\tau = 6$ hrs   $\tau = 3$ hrs
[...]                                                                         [...]             [...]            0.67%
$\Delta\alpha/\alpha = 0$, $\Delta\beta/\beta = 0$, $\Delta F/F = 20\%$       0.91%             0.77%            0.66%
$C_I$                                                                         2.39%             1.93%            1.69%
$C_{II}$                                                                      3.59%             3.15%            2.75%



Table 4: EKF with deterministic process assumption, $P^m = P^m_{dp}$. Time mean analysis error variance. The
quadratic, deterministic noise, approximation, Eq. (31), is employed. Model parametric errors are given in the
first column. Assimilation interval is given in the first row. All results are computed over 5 years of simulated
time after a 1-year-long spin-up period. Errors are expressed as a percentage of the system climate variance.

$\Delta\lambda/\lambda$                                                       $\tau = 12$ hrs   $\tau = 6$ hrs   $\tau = 3$ hrs
$\Delta\alpha/\alpha = -20\%$, $\Delta\beta/\beta = 0$, $\Delta F/F = 0$      2.54%   [...]   0.97%   [...]   0.77%
$\Delta\alpha/\alpha = 0$, $\Delta\beta/\beta = 0$, $\Delta F/F = -20\%$      0.92%             0.77%            0.67%
$\Delta\alpha/\alpha = 0$, $\Delta\beta/\beta = 0$, $\Delta F/F = 20\%$       0.91%             0.77%            0.66%
$C_I$                                                                         2.17%             1.80%            1.60%
$C_{II}$                                                                      3.02%             2.45%            1.99%

Figure 1: Mean error evolution for configurations $C_I$ (top panel) and $C_{II}$ (bottom panel). Actual error (continuous
line); linearly evolved mean initial condition error, $\langle M_{t,t_0} \delta x_0 \rangle$ (dotted line); total error, model plus
initial condition, approximated by Eq. (20) (dashed line). Error is shown as a function of the model gridpoints
at times $t = 0$, 12, 24 hours.

Figure 2: Mean square error (trace of the covariance matrix) evolution for different model errors, indicated in
each panel. Actual error (continuous line); initial condition error, linear dynamics $\langle (M_{t,t_0} \delta x_0)(M_{t,t_0} \delta x_0)^T \rangle$
(dotted line); total error approximated by Eq. (21) (dashed line). All values are normalized with the true system
climate variance.

Figure 3: Error covariance evolution between points 1 and 36 (left panels) and 20 and 30 (right panels). Actual
error (continuous lines); initial condition error, linear dynamics $M_{t,t_0} \langle \delta x_0 \delta x_0^T \rangle M_{t,t_0}^T$ (dotted lines)
and the total error approximated by Eq. (17) (dashed lines). Parametric model error configuration $C_I$ (top
panels) and $C_{II}$ (bottom panels). The values are normalized with the nature climate variance.

Figure 4: Time running mean of the EKF analysis error variance. The EKF is employed in the perfect model
scenario applied on an imperfect model. Parametric model error configurations: perfect (top panel), $C_I$ (mid
panel), $C_{II}$ (bottom panel). Assimilation intervals: $\tau = 12$ hours (dash-dotted lines), 6 hours (continuous lines)
and 3 hours (dashed lines). Values are normalized with the nature climate variance.

Figure 5: Time running mean EKF analysis error variance when model error is considered as a white
noise, Eq. (32) (left column), and as a deterministic process, Eq. (31) (right column). Parametric model error
configurations: $C_I$ (top panels) and $C_{II}$ (bottom panels). The assimilation interval is fixed to $\tau = 12$ hours
(dash-dotted lines), 6 hours (continuous lines) and 3 hours (dashed lines). The values are normalized with the
nature climate variance.

Figure 6: EKF analysis error variance as a function of time for the experiment with $\tau = 6$ hours, during the last
year of the simulation. Model error treatment: white noise hypothesis, Eq. (32) (left column), deterministic
process, Eq. (31) (right column). Parametric model error configurations: $C_I$ (top panels) and $C_{II}$ (bottom
panels). The values are normalized with the nature climate variance.

Figure 7: EKF experiments with $\tau = 6$ hours: $x_{15}$ as a function of time for model configuration $C_{II}$. The true
system (continuous lines), observations (circles), the EKF solution with the white noise hypothesis for model
error (dotted lines) and the EKF solution with the deterministic process hypothesis for model error (dash-dotted
lines) are plotted. The four plots display the field values during the first week of the 1st, 4th, 7th and 10th
month of the last year of simulated time.

Figure 8: EKF time average analysis error variance as a function of the coefficient $\gamma$ of the model error covariance
matrix, $P^m = \gamma P^m_{dp}$. The parametric model error configuration is $C_{II}$ and the assimilation interval is fixed to
$\tau = 3$ hours (circles), $\tau = 6$ hours (triangles) and $\tau = 12$ hours (squares). The values are normalized with the nature
climate variance. Note that the x-axis is in logarithmic scale.

Figure 9: EKF time average analysis error variance as a function of the observation error variance (expressed as
a percentage of the system's climate variance) for parametric model error configuration $C_{II}$. The assimilation
interval is fixed to $\tau = 3$ hours (circles), $\tau = 6$ hours (triangles) and $\tau = 12$ hours (squares). The different lines refer
to the perfect model assumption (dashed lines), the white noise assumption (continuous lines) and the deterministic
process assumption (dash-dotted lines). The values are normalized with the nature climate variance. Note that
the y-axis is in logarithmic scale.

Figure 10: EKF time average analysis error variance as a function of the number of observations for the
parametric model error configuration $C_{II}$. The assimilation interval is fixed to $\tau = 3$ hours (circles), $\tau = 6$
hours (triangles) and $\tau = 12$ hours (squares). The lines refer to the perfect model assumption (dashed lines), the white noise
assumption (continuous lines) and the deterministic process assumption (dash-dotted lines). The values are
normalized with the nature climate variance. Note that the y-axis is in logarithmic scale.